Assessment of severity scoring systems for predicting mortality in critically ill patients receiving continuous renal replacement therapy

The incidence of acute kidney injury (AKI) is increasing every year and many patients with AKI admitted to the intensive care unit (ICU) require continuous renal replacement therapy (CRRT). This study compared and analyzed severity scoring systems to assess their suitability in predicting mortality in critically ill patients receiving CRRT. Data from 612 patients receiving CRRT in four ICUs of the Korea University Medical Center between January 2016 and November 2018 were retrospectively collected. The mean age of all patients was 67.6 ± 14.8 years, and the proportion of males was 59.6%. The endpoints were in-hospital mortality and 7-day mortality, measured from the day of CRRT initiation to the date of death. The Program to Improve Care in Acute Renal Disease (PICARD), Demirjian’s, Acute Physiology and Chronic Health Evaluation (APACHE) II, Simplified Acute Physiology Score (SAPS) 3, Sequential Organ Failure Assessment (SOFA), Multiple Organ Dysfunction Score (MODS), and Liano’s scores were used to predict mortality. The in-hospital and 7-day mortality rates in the study population were 72.7% and 45.1%, respectively. The area under the receiver operating characteristic curve (AUROC) revealed the highest discrimination ability for Demirjian’s score (0.770), followed by Liano’s score (0.728) and APACHE II (0.710). The AUROCs for the SAPS 3, MODS, and PICARD were 0.671, 0.665, and 0.658, respectively. The AUROC of Demirjian’s score was significantly higher than that of the other scores, except for Liano’s score. The Hosmer-Lemeshow test on Demirjian’s score showed a poor fit in our analysis; however, it was more acceptable than that of the general severity scores. Kidney-specific severity scoring systems showed better performance in predicting mortality in critically ill patients receiving CRRT than general severity scoring systems.


Introduction
Acute kidney injury (AKI) occurs in 15%-38% of hospitalized patients, with an in-hospital mortality rate of 23.9%-60.3%; in critically ill patients, the incidence is as high as 74.5% [1][2][3][4][5][6]. Furthermore, AKI causes additional complications and aggravates the underlying disease, leading to increased hospital stay duration and medical costs [1,2]. Despite continued progress in medical technology, the incidence of AKI is increasing every year [7] and has become a major public health concern [8]. Among the patients with AKI admitted to the intensive care unit (ICU), 72.5% required renal replacement therapy and 80% received continuous renal replacement therapy (CRRT) [6], primarily because CRRT offers greater hemodynamic stability than intermittent hemodialysis and allows fluid balance to be controlled easily [9].
However, CRRT has the disadvantages of high cost and the need for skilled personnel because of the risks of arrhythmia, bleeding, and hypotension [9][10][11]. Therefore, patient severity must be assessed to predict prognosis, to give medical staff the information they need to discuss and make sound decisions about patient care, and to inform patients and caregivers about future treatment directions [12]. Moreover, predicting the mortality rate of patients admitted to the ICU is critical for assessing the severity of the disease and adjudicating the value of new treatments, interventions, and health care policies [13]. Estimates of mortality risk can be useful for the efficient allocation of resources and judgment of treatment adequacy in medical institutions by comparing actual and expected outcomes [14,15]. Severity scoring systems can be divided into those that assess overall health status and those that measure severity by focusing on specific organs. Various scoring systems have been developed to predict disease prognosis [16][17][18][19][20][21][22]. Several studies have used severity scoring systems to effectively apply CRRT according to the acuity of illness parameters [23,24]. Additionally, in studies aiming to improve the quality of CRRT, such as those on the optimal start time of CRRT [25,26], severity scores are used for population stratification and balanced randomization so that disease severity does not affect the statistical outcome, which supports the objectivity and reliability of the study [27]. Although other studies have compared the predictive abilities of severity scoring systems in patients with AKI, most included all patients diagnosed with AKI who received any renal replacement therapy, including intermittent hemodialysis, and only a few focused solely on CRRT [28][29][30][31][32][33][34].
This study aimed to evaluate the predictive ability of severity scoring systems for mortality in critically ill patients receiving CRRT. This study compared and analyzed the Acute Physiology and Chronic Health Evaluation (APACHE) II score, Simplified Acute Physiology Score (SAPS) 3, Sequential Organ Failure Assessment (SOFA) score, and Multiple Organ Dysfunction Score (MODS), which are general severity scoring systems predicting mortality, and Liano's, Program to Improve Care in Acute Renal Disease (PICARD), and Demirjian's scores, which are kidney-specific severity scoring systems. The results of this study could be used as a basis for selecting scoring systems suitable for severity assessment in patients with AKI receiving CRRT, and for developing a new severity scoring system.

Study population
Patients who received CRRT in four ICUs at the Korea University Medical Center from January 2016 to November 2018 were selected. The target group included patients with AKI and patients with chronic kidney disease (CKD) who had not undergone dialysis. Patients who had previously undergone hemodialysis or peritoneal dialysis for end-stage renal disease or who had received a kidney transplant were excluded. Of the 768 patients who received CRRT during this period, 156 were excluded (119 who had previously undergone hemodialysis, 23 who had undergone peritoneal dialysis, and 14 who had received a kidney transplant), leaving 612 for the final analysis.
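The cohort derivation above can be checked with a few lines of arithmetic. This is only an illustrative sketch; the variable names are hypothetical and the counts are those reported in the text.

```python
# Cohort flow check using the counts reported in the text.
# Variable names are illustrative, not taken from the study.
screened = 768
excluded = {
    "prior hemodialysis": 119,
    "prior peritoneal dialysis": 23,
    "kidney transplant": 14,
}

total_excluded = sum(excluded.values())
included = screened - total_excluded

print(f"excluded: {total_excluded}, included: {included}")
```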
In cases where CRRT was performed several times during hospitalization, the clinical symptoms and diagnostic test results at the time of initial treatment were investigated.
This study was approved by the Institutional Review Board of the Korea University Anam Hospital (approval number: 2018AN0415). Informed consent was waived by the board as this study was conducted retrospectively and the data were de-identified prior to analysis. All study methods were performed in accordance with relevant guidelines and regulations.

Data
Clinical data of patients receiving CRRT were retrospectively collected from electronic medical records. The patients' age, sex, in-hospital mortality, survival at 7 days from the initiation of CRRT, comorbidities, reason for CRRT, mean arterial pressure (MAP) at the initiation of treatment, use of vasopressors, mechanical ventilation, and laboratory findings, including serum hemoglobin, serum creatinine, albumin, arterial blood gas analysis, and C-reactive protein levels, were investigated.
The APACHE II, SAPS 3, SOFA, MODS, Liano's, PICARD, and Demirjian's scores were calculated and used to predict mortality. Severity scoring was performed at the initiation of CRRT, and all variables for the severity scores were collected within 24 h prior to the initiation of CRRT. The endpoints were in-hospital mortality and 7-day mortality, with time measured from the day of CRRT initiation to the date of death.

Statistical analysis
The general characteristics of the participants were summarized as mean ± standard deviation, and the area under the receiver operating characteristic curve (AUROC) of each severity score was calculated to assess the discrimination of the severity scoring systems.
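As a minimal sketch of what the AUROC measures, it can be computed as the probability that a randomly chosen non-survivor has a higher score than a randomly chosen survivor, with ties counted as one half. The function name `auroc` and the toy data are illustrative assumptions; the study itself used statistical software, not this code.

```python
def auroc(died, scores):
    """AUROC as the fraction of (non-survivor, survivor) pairs in which
    the non-survivor has the higher score; ties count as 0.5."""
    pos = [s for d, s in zip(died, scores) if d]      # non-survivors
    neg = [s for d, s in zip(died, scores) if not d]  # survivors
    if not pos or not neg:
        raise ValueError("need at least one survivor and one non-survivor")
    wins = sum(
        1.0 if p > n else 0.5 if p == n else 0.0
        for p in pos
        for n in neg
    )
    return wins / (len(pos) * len(neg))


# Toy example (made-up values): 3 non-survivors, 2 survivors.
example_died = [1, 1, 1, 0, 0]
example_scores = [0.9, 0.6, 0.4, 0.5, 0.2]
print(round(auroc(example_died, example_scores), 3))
```

A value of 0.5 corresponds to chance-level discrimination and 1.0 to perfect separation of survivors and non-survivors.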
Calibration of the severity scores was assessed using the Hosmer-Lemeshow goodness-of-fit test. Data analysis was performed using IBM SPSS statistical software, version 23 (SPSS Inc., Chicago, IL, USA), and ROC comparisons were performed using the MedCalc statistical software (MedCalc, Ostend, Belgium). Statistical significance was set at a p-value <0.05.
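The Hosmer-Lemeshow statistic can be sketched as follows, assuming the common formulation in which patients are sorted by predicted mortality, split into equal-sized groups, and observed deaths are compared with expected deaths in each group. The function name, the grouping rule, and the restriction to an even number of groups (which permits a closed-form chi-square tail probability without external libraries) are assumptions of this sketch, not details reported by the study.

```python
import math


def hosmer_lemeshow(died, predicted, groups=10):
    """Return (statistic, degrees of freedom, p-value).

    Sketch only: uses near-equal-sized groups of sorted predictions and
    requires an even number of groups so that the chi-square survival
    function has a simple closed form.
    """
    if groups < 4 or groups % 2:
        raise ValueError("this sketch needs an even number of groups >= 4")
    pairs = sorted(zip(predicted, died))  # ascending predicted risk
    n = len(pairs)
    stat = 0.0
    for g in range(groups):
        chunk = pairs[g * n // groups:(g + 1) * n // groups]
        if not chunk:
            continue
        size = len(chunk)
        observed = sum(d for _, d in chunk)  # observed deaths
        expected = sum(p for p, _ in chunk)  # expected deaths
        denom = expected * (1 - expected / size)
        if denom > 0:  # skip groups with no predicted variability
            stat += (observed - expected) ** 2 / denom
    df = groups - 2
    half = stat / 2
    # Chi-square survival function for even df (Poisson-sum identity).
    p_value = math.exp(-half) * sum(
        half ** i / math.factorial(i) for i in range(df // 2)
    )
    return stat, df, p_value
```

A small p-value indicates that predicted and observed mortality differ more than expected by chance, that is, poor calibration; a large p-value gives no evidence of poor fit.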

Baseline characteristics and mortality of the study population
A total of 612 participants were enrolled in the study. The mean age of all participants was 67.6 ± 14.8 years and 59.6% were males (Table 1). The in-hospital mortality rate was 72.7% and the 7-day mortality rate was 45.1%. Patients with CKD accounted for 11.9% of the total study population, and this proportion did not differ significantly between survivors and non-survivors. There were no significant differences between survivors and non-survivors in terms of mean age, sex, and frequency of AKI causes. Table 2 shows the clinical test results and acute physiology of the study population at the time of CRRT initiation. pH, serum albumin, and platelet count in non-survivors were significantly lower than those in survivors (p < 0.01 for all).

Severity scores of study population
The mean severity scores for the study population were as follows: APACHE II score, 35.5; SAPS 3, 84.6; SOFA score, 9.0; MODS, 10.7; Liano's score, 0.55; PICARD score, 0.43; and Demirjian's score, 0.60. Table 3 shows a comparison of the severity scores between survivors and non-survivors. There was a significant difference in the mean of all severity scores between survivors and non-survivors (p < 0.01 for all).

Discrimination of each severity scoring system
The AUROCs for in-hospital mortality are shown in Fig 1. AUROC revealed acceptable discrimination ability for Demirjian's score, followed by Liano's score. Table 4 shows the results of the comparison of AUROC between the scoring systems. Demirjian's score was not significantly different from Liano's score but was significantly higher than the rest.
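A pairwise AUROC comparison of this kind can be sketched with the nonparametric approach of DeLong et al., which accounts for both scores being measured on the same patients. The text states only that MedCalc was used and does not name the comparison method, so the choice of DeLong's test here is an assumption, and the function names and toy data are hypothetical.

```python
import math


def _placements(pos, neg):
    """Per-patient win fractions (ties count 0.5) for one score."""
    def psi(x, y):
        return 1.0 if x > y else 0.5 if x == y else 0.0
    v10 = [sum(psi(x, y) for y in neg) / len(neg) for x in pos]
    v01 = [sum(psi(x, y) for x in pos) / len(pos) for y in neg]
    return v10, v01


def _cov(u, v):
    """Sample covariance with denominator len(u) - 1."""
    mu, mv = sum(u) / len(u), sum(v) / len(v)
    return sum((a - mu) * (b - mv) for a, b in zip(u, v)) / (len(u) - 1)


def compare_aurocs(died, score_a, score_b):
    """Return (AUROC of A, AUROC of B, two-sided p-value for A vs. B).

    Needs at least two survivors and two non-survivors.
    """
    comps = []
    for scores in (score_a, score_b):
        pos = [s for d, s in zip(died, scores) if d]
        neg = [s for d, s in zip(died, scores) if not d]
        comps.append(_placements(pos, neg))
    (a10, a01), (b10, b01) = comps
    m, n = len(a10), len(a01)
    auc_a, auc_b = sum(a10) / m, sum(b10) / m
    # Variance of the AUROC difference from the two covariance matrices.
    var = ((_cov(a10, a10) + _cov(b10, b10) - 2 * _cov(a10, b10)) / m
           + (_cov(a01, a01) + _cov(b01, b01) - 2 * _cov(a01, b01)) / n)
    if var <= 0:
        return auc_a, auc_b, float("nan")
    z = (auc_a - auc_b) / math.sqrt(var)
    p_value = math.erfc(abs(z) / math.sqrt(2))  # two-sided normal p-value
    return auc_a, auc_b, p_value


# Toy data (made up): 4 non-survivors followed by 4 survivors.
toy_died = [1, 1, 1, 1, 0, 0, 0, 0]
toy_a = [0.9, 0.8, 0.7, 0.4, 0.5, 0.3, 0.2, 0.1]
toy_b = [0.6, 0.3, 0.8, 0.2, 0.7, 0.5, 0.4, 0.1]
print(compare_aurocs(toy_died, toy_a, toy_b))
```

With real data, `score_a` and `score_b` would hold, for example, each patient's Demirjian's and APACHE II predictions; a p-value below the chosen significance level would indicate a significant difference in discrimination.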

Discussion
This study evaluated and compared the predictive ability of severity scores in patients who received CRRT. The primary finding was that the kidney-specific severity scores performed better than the general severity scores. The AUROC for in-hospital mortality revealed acceptable discrimination ability of Demirjian's score (0.770), followed by Liano's (0.728) and APACHE II (0.710) scores. Demirjian's score also showed the highest predictive value for 7-day mortality, followed by Liano's and APACHE II scores. The AUROC comparison showed that Demirjian's score was significantly higher than the other scores except for Liano's score. In addition, the Hosmer-Lemeshow test results of the five scores that provide predicted mortality showed poor calibration of all scores except for Liano's and PICARD. The calibration for 7-day mortality was similar to that for in-hospital mortality. The in-hospital mortality was 72.7% in this study, which is higher than the 23.9%-60.3% reported for AKI [1][2][3][4][5][6], implying that among patients with AKI, those undergoing CRRT have a higher mortality rate. In a previous study of 1738 patients with AKI, 76.2% received mechanical ventilation, 69.1% received vasopressors, and 47.5% had sepsis at the onset of CRRT [6]. The use of mechanical ventilation was lower in this study than in that study; however, the rates of vasopressor use and sepsis were higher. These differences may be related to differences in mortality rates. In the non-survivor group, the MAP was lower at the start of CRRT, and mechanical ventilation and vasopressor use were more frequent, indicating that vital signs at the beginning of CRRT were worse. However, the mortality rate of our study population, which was relatively higher than that of other studies, is a potential limitation.
Among the general severity scores evaluated in our study, only the APACHE II score had an AUROC above 0.7. The APACHE II score was more discriminative than the other general scores; however, the goodness-of-fit test showed poor calibration, and the calibration line tended to underestimate mortality.
In previous studies that evaluated mortality based on general severity scores in patients with AKI, the discrimination results were inconsistent. Passos et al. compared the APACHE II score, SAPS 3, and SOFA score in 186 patients with sepsis who underwent CRRT, and the AUROCs showed poor discrimination, with values of 0.57, 0.48, and 0.58, respectively [33]. A study of 1169 patients with AKI in China from 1996 to 2013 showed that the AUROC of the SOFA score was 0.78 [34]. Of the 731 patients, only 56.1% underwent RRT, and the overall mortality rate was 13.8%, indicating that the severity of the disease was low.
Liano's score was developed by Liano et al. in Spain [20]. In this study, the discriminative ability of Liano's score was higher than that of the other scores, except for Demirjian's score. Additionally, the Hosmer-Lemeshow test on Liano's score showed good calibration. Liano's score has been evaluated for external validity in several previous studies. Uchino et al. conducted a prospective multinational multicenter study of patients with AKI involving 54 medical institutions in 23 countries from 2000 to 2001 [28]. Four kidney-specific severity scores (Mehta [35], Liano's, Chertow [36], and Paganini [37]) and two general severity scores (SAPS II and SOFA) were calculated to compare their predictive ability. The AUROC of Liano's score was 0.698, which was more discriminative than the other scores; however, all AUROCs were less than 0.7. In that study, the results of the goodness-of-fit tests were poor for all scores except Liano's score.
The PICARD score was developed by Chertow et al. based on the Program to Improve Care in Acute Renal Disease (PICARD), a multicenter study of 618 patients with AKI in five U.S. medical institutions from 1999 to 2001 [21]. Formulas for predicting 60-day mortality were developed separately for three time points: AKI diagnosis, consultation, and initiation of dialysis. In this study, the prediction formula for the dialysis initiation time, which had the highest AUROC in internal validation, was applied and analyzed. The PICARD score showed good calibration but the lowest discrimination among the kidney-specific severity scores in our analysis. Discrimination is affected by the distribution of the target group: it tends to be poor in a homogeneous group and good in a heterogeneous group [39]. The predicted mortality rate of PICARD was 43%, indicating its tendency to underestimate the mortality rate. This is thought to be attributable to the 60-day mortality criterion and the low mortality rate of 37% in the population in which the score was developed.
Demirjian's score exhibited the highest discriminative ability. Demirjian's score was developed from the Veterans Affairs/National Institutes of Health Acute Renal Failure trial network study in the United States [22] to predict the 60-day mortality by selecting 21 variables affecting mortality among patients with AKI who received CRRT. The Hosmer-Lemeshow test on Demirjian's score showed a poor fit in our analysis; however, it was more acceptable than general severity scores.
Although this study is limited in that it analyzed retrospectively collected data from the ICUs of a single institution, it has the strength of assessing the mortality predictability of kidney-specific severity scores only in patients who received CRRT. In several previous studies, general severity scores were used for population stratification and balanced randomization to improve the quality of CRRT. For example, Zarbock et al. compared the effect of early and delayed RRT initiation on mortality in critically ill patients with AKI, in which randomization was stratified according to SOFA cardiovascular scores [25]. In the study by Barbar et al. on the timing of RRT in patients with AKI, randomization was performed based on a minimization technique with stratification according to center, age, SOFA score, and type of infection [26]. This study supports the finding that kidney-specific severity scores have higher discriminative ability than general severity scores in predicting mortality in patients receiving CRRT, and highlights the need to develop more accurate predictive tools for patients with AKI receiving CRRT.
Patients with CKD were included in the study population, except those who received renal replacement therapy, such as intermittent hemodialysis, peritoneal dialysis, and kidney transplantation. Since the focus was on patients receiving CRRT, the study results are unlikely to change due to the characteristics of this cohort; however, the lack of information, such as baseline creatinine or eGFR, is a limitation of this study.

Conclusions
In summary, compared with general severity scores, kidney-specific severity scores demonstrated better calibration and discrimination in predicting mortality in patients with AKI receiving CRRT. However, none of the scores evaluated in this study exhibited both excellent discrimination and excellent calibration; all severity scoring systems included in this study had a limited ability to predict mortality in critically ill patients requiring CRRT. Therefore, we emphasize the need to develop novel severity scores with good calibration and high discrimination abilities.